{ggvfields}

Principled Tools for Modeling and Visualizing 2D Vector Fields

ggvfields

Reintroduce the problem

Vector fields

ggvfields can be understood by analogy with core ggplot2 concepts

Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus

Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))

Vector fields

ggvfields can be understood by analogy with core ggplot2 concepts

Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus

Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))

Vector fields

ggvfields can be understood by analogy with core ggplot2 concepts

Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus

Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))

f <- function(x) sin(x)
ggplot() +
  geom_function(fun = f, xlim = c(-pi, pi))

Vector fields

ggvfields can be understood by analogy with core ggplot2 concepts

Vector fields are vector-valued functions of vector arguments \(f: \mathbb{R}^{n} \to \mathbb{R}^{m}\)

In this talk we’ll assume \(n = m = 2\), a common case in applications

Thought of in various ways

\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}f_{1}(x,y) \\ f_{2}(x,y) \end{matrix}\right]}\) “component functions”

Examples:

\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}-y \\ x \end{matrix}\right]}\)

\(\displaystyle{\nabla \ell(\mu,\sigma^{2}) = \left[\begin{matrix}\frac{\partial}{\partial\mu} \ell(\mu,\sigma^{2}) \\ \frac{\partial}{\partial\sigma^{2}} \ell(\mu,\sigma^{2}) \end{matrix}\right]}\)

Current R solutions

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

N <- 11
df <- expand.grid( "x" = seq(-1, 1, len = N), "y" = seq(-1, 1, len = N) )
df$fy <- df$fx <- NA

for (i in 1:nrow(df)) df[i,3:4] <- f( as.numeric(df[i,1:2]) )

df <- transform(df, "radius" = sqrt(fx^2 + fy^2), "angle" = atan2(fy, fx))

head(df)
#      x  y fx   fy   radius      angle
# 1 -1.0 -1  1 -1.0 1.414214 -0.7853982
# 2 -0.8 -1  1 -0.8 1.280625 -0.6747409
# 3 -0.6 -1  1 -0.6 1.166190 -0.5404195
# 4 -0.4 -1  1 -0.4 1.077033 -0.3805064
# 5 -0.2 -1  1 -0.2 1.019804 -0.1973956
# 6  0.0 -1  1  0.0 1.000000  0.0000000

Current R solutions

head(df)
#      x  y fx   fy   radius      angle
# 1 -1.0 -1  1 -1.0 1.414214 -0.7853982
# 2 -0.8 -1  1 -0.8 1.280625 -0.6747409
# 3 -0.6 -1  1 -0.6 1.166190 -0.5404195
# 4 -0.4 -1  1 -0.4 1.077033 -0.3805064
# 5 -0.2 -1  1 -0.2 1.019804 -0.1973956
# 6  0.0 -1  1  0.0 1.000000  0.0000000

base R

with(df, {
  plot(NULL,
    xlab = "x", ylab = "y",
    xlim = range(x, x + fx), 
    ylim = range(y, y + fy)
  )
  arrows(x, y, x+fx, y+fy)
})

ggplot2

ggplot(df, aes(x = x, y = y)) +
  geom_segment(
    aes(xend = x + fx, yend = y + fy),
    arrow = arrow()
  )

Current R solutions

ggquiver

library("ggquiver")

ggplot(df, aes(x = x, y = y)) +
  geom_quiver(aes(u = -y, v = x))

metR

library("metR")

ggplot(df, aes(x = x, y = y)) +
  geom_streamline(aes(dx = fx, dy = fy))  

Current R solutions

ggfields

library("ggfields")

df |> 
  mutate(angle = angle - pi/2) |> 
  ggplot(aes(x = x, y = y)) +
  ggfields::geom_fields(aes(angle = angle, radius = radius))

Current Python solutions

import numpy as np
import matplotlib.pyplot as plt

# Create a grid of points
x = np.linspace(-10, 10, 20)
y = np.linspace(-10, 10, 20)
X, Y = np.meshgrid(x, y)

# Define the vector field
U = -Y
V = X

# Plot the vector field
plt.quiver(X, Y, U, V)
plt.show()

Current Mathematica solutions

f[x_, y_] := {-y, x}
VectorPlot[f[x, y], {x, -10, 10}, {y, -10, 10}]

Mathematica 11.3

Mathematica 13.1

Visualizing vector fields


Challenges with results

Unattractive results Could be more visually appealing

Not informative Lacks clarity or detail


Challenges with code

Verbose syntax Very clunky and wordy

High user burden Lots of effort and mental load

Difficult to customize Hard to see now; examples to come

Not extensible Faceting? Superimposing different colors? etc.

Ridged Accept data one way


The results should be…

Correct Does what it claims to do

Clear Easily understandable

Attractive Visually appealing


The code should be…

Simple Straightforward and concise

Familiar Easy for a ggplot user

Flexible Accept functions or data of any form

The last we talked

Item          PPP Defense
Gas Price (per gallon) $3.25 $3.00
Milk Price (per gallon) $4.03 $4.02
Egg Price (per dozen) $3.17 $4.90

The last we talked

Item          PPP Defense
Gas Price (per gallon) $3.25 $3.00
Milk Price (per gallon) $4.03 $4.02
Egg Price (per dozen) $3.17 $4.90
Dallas Cowboys Super Bowls Since 1996 0 0
Taylor Swift’s Boyfriend Travis Kelce Travis Kelce

The last we talked

Item          PPP Defense
Gas Price (per gallon) $3.25 $3.00
Milk Price (per gallon) $4.03 $4.02
Egg Price (per dozen) $3.17 $4.90
Dallas Cowboys Super Bowls Since 1996 0 0
Taylor Swift’s Boyfriend Travis Kelce Travis Kelce
ggvfields

Stream Fields

ggvfields

ggvfields


Install from CRAN

install.packages("ggvfields")

Install from Github

remotes::install_github("dusty-turner/ggvfields")

Load the package

library("ggvfields")

There have been 171 CRAN installs since the debut on 15 March 2025.

ggvfields

Inputs

Users will come with vector data
- x, y, fx, fy
- x, y, xend, yend
- x, y, angle, direction
- x, y, angle_deg, direction

#      x  y fx   fy   radius      angle xend yend
# 1 -1.0 -1  1 -1.0 1.414214 -0.7853982  0.0 -2.0
# 2 -0.8 -1  1 -0.8 1.280625 -0.6747409  0.2 -1.8
# 3 -0.6 -1  1 -0.6 1.166190 -0.5404195  0.4 -1.6
# 4 -0.4 -1  1 -0.4 1.077033 -0.3805064  0.6 -1.4
# 5 -0.2 -1  1 -0.2 1.019804 -0.1973956  0.8 -1.2

Stream data
- x, y, t

#      x  y t
# 1 -1.0 -1 1
# 2 -0.8 -1 2
# 3 -0.6 -1 3
# 4 -0.4 -1 4
# 5 -0.2 -1 5

And functions
- function(u) c(-u[2], u[1])
- function(u) c(sin(u[2] + u[1]), cos(u[2] - u[1]))

f <- function(u) c(-u[2], u[1])

Visualization Types

Users will want both vector fields and stream fields.

Streams (and vectors) often arise from ODEs

Consider the vector field

\[ \textbf{f}(x,y) = (-y,\,x) \]

which yields the ODE system

\[ \frac{dx}{dt} = -y,\quad \frac{dy}{dt} = x \]

Euler’s method estimates the ODE solution by

\[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\textbf{f}(\mathbf{v}_n) \]

Starting at \(\mathbf{v}_{0} = (1,1)\) and using \(\Delta t = 0.1\):

  • Iteration 1:
    \(\textbf{f}(1,1) = (-1,\,1)\), so
    \(\textbf{v}_{1} = (1,1) + 0.1\,(-1,1) = (0.9,\,1.1)\).

  • Iteration 2:
    \(\textbf{f}(0.9,1.1) = (-1.1,\,0.9)\), so
    \(\textbf{v}_2 = (0.9,1.1) + 0.1\,(-1.1,0.9) = (0.9-0.11,\,1.1+0.09) = (0.79,\,1.19)\).

library("deSolve")
# Set ode function
f_ode <- function(t, state, parms) list( f(state) )

# Set the initial state and define time sequence
init <- c(x = 1, y = 1)
times <- seq(0, 1, by = 0.1)
# Integrate the system using ode
(stream <- ode(init, times, f_ode, parms = NULL, method = "euler"))
#    time           x        y
# 1   0.0  1.00000000 1.000000
# 2   0.1  0.90000000 1.100000
# 3   0.2  0.79000000 1.190000
# 4   0.3  0.67100000 1.269000
# 5   0.4  0.54410000 1.336100
# 6   0.5  0.41049000 1.390510
# 7   0.6  0.27143900 1.431559
# 8   0.7  0.12828310 1.458703
# 9   0.8 -0.01758719 1.471531
# 10  0.9 -0.16474031 1.469772
# 11  1.0 -0.31171756 1.453298

Stream Data

ggplot(stream, aes(x = x, y = y, color = time)) +
  geom_path(arrow = arrow()) + 
  geom_point()

Stream Data

ggplot(stream, aes(x = x, y = y, t = time)) +
  geom_stream()

Vector Data

Reminder: A vector is a special case of a stream

init <- c(x = 1, y = 1); times <- seq(0, 1, by = 1)

ode(init, times, f_ode, parms = NULL, method = "euler")
#   time x y
# 1    0 1 1
# 2    1 0 2

Reshaping Happens

vector
# # A tibble: 1 × 9
#   x_start x_end y_start y_end    fx    fy angle angle_deg distance
#     <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
# 1       1     0       1     2    -1     1  2.36       135     1.41
p <- ggplot(vector, aes(x = x_start, y = y_start)) 
p1 <- p + geom_segment(aes(xend = x_end, yend = y_end), arrow = arrow())
p2 <- p + geom_vector (aes(xend = x_end, yend = y_end), normalize = FALSE, center = FALSE)
p1 + p2

p3 <- p + geom_vector(aes(xend = x_end, yend = y_end), normalize = FALSE, center = FALSE)
p4 <- p + geom_vector(aes(fx = fx, fy = fy), normalize = FALSE, center = FALSE)
p5 <- p + geom_vector(aes(angle = angle, distance = distance), normalize = FALSE, center = FALSE)
p6 <- p + geom_vector(aes(angle_deg = angle_deg, distance = distance), normalize = FALSE, center = FALSE)

(p3 + p4) / (p5 + p6) + plot_layout(guides = "collect")

Many Streams

streams <-
  expand.grid(x = seq(-1,1,.5), y = seq(-1,1,.5)) |> 
  apply(1, generate_path, step = .1) |> 
  bind_rows(.id = "id")

Streams

# # A tibble: 4 × 3
#    time      x     y
#   <dbl>  <dbl> <dbl>
# 1   0   -1     -1   
# 2   0.1 -0.9   -1.1 
# 3   0.2 -0.79  -1.19
# 4   0.3 -0.671 -1.27
# # A tibble: 4 × 3
#    time      x     y
#   <dbl>  <dbl> <dbl>
# 1   0   -0.5   -1   
# 2   0.1 -0.4   -1.05
# 3   0.2 -0.295 -1.09
# 4   0.3 -0.186 -1.12
# # A tibble: 4 × 3
#    time     x     y
#   <dbl> <dbl> <dbl>
# 1   0   0     -1   
# 2   0.1 0.1   -1   
# 3   0.2 0.2   -0.99
# 4   0.3 0.299 -0.97
ggplot(streams) + 
  geom_stream(aes(x = x, y = y, t = time, group = id))

Many Vectors

vectors <-
  expand.grid(x = seq(-1,1,.5), y = seq(-1,1,.5)) |> 
  apply(1, generate_path, step = 1) |> bind_rows(.id = "id")

Vectors

# # A tibble: 8 × 4
# # Groups:   id [4]
#   id     time     x     y
#   <chr> <dbl> <dbl> <dbl>
# 1 1         0  -1    -1  
# 2 1         1   0    -2  
# 3 2         0  -0.5  -1  
# 4 2         1   0.5  -1.5
# 5 3         0   0    -1  
# 6 3         1   1    -1  
# 7 4         0   0.5  -1  
# 8 4         1   1.5  -0.5

Vectors Wide

# # A tibble: 4 × 5
# # Groups:   id [4]
#   id    x_start x_end y_start y_end
#   <chr>   <dbl> <dbl>   <dbl> <dbl>
# 1 1        -1     0        -1  -2  
# 2 2        -0.5   0.5      -1  -1.5
# 3 3         0     1        -1  -1  
# 4 4         0.5   1.5      -1  -0.5
p1 <- ggplot(vectors, aes(x = x, y = y, t = time, group = id)) + 
  geom_stream()

p2 <- ggplot(vectors_wide, aes(x = x_start, y = y_start, xend = x_end, yend = y_end)) +
  geom_vector(normalize = FALSE, center = FALSE)

p1 / p2

Stream Fields

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

ggplot() + geom_stream_field(fun = f, xlim = c(-1,1), ylim = c(-1,1))

Stream Fields

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

ggplot() + geom_stream_field(fun = f)

Vector Fields

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

ggplot() + geom_stream_field(fun = f, type = "vector")

Vector Fields

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

ggplot() + geom_vector_field(fun = f)

Vector and Stream Fields

f <- function(u) {
  x <- u[1]; y <- u[2]
  c(-y, x)
}

ggplot() + 
  geom_vector_field(fun = f, n = 3, color = university_gold, linewidth = 3) + 
  geom_stream_field(fun = f, n = 3, color = baylor_green, linewidth = 3)

Visualizing \(\mathbb{R}^2 \to \mathbb{R}^2\) is Imperfect

There is too much information for just two 2 aesthetics

Many attempts to address this provide understanding in some ways but lose understanding in others

Stream Options: center = TRUE

Recall the ODE system: \[ \textbf{f}(x,y) = (-y,\,x) \quad \text{(i.e., } \frac{dx}{dt}=-y,\; \frac{dy}{dt}=x\text{)} \]

Euler’s method estimates the solution: \[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\mathbf{f}(\mathbf{v}_n) \]

center = FALSE:
\(\Delta t\): from \(0\) to \(T\)

center = TRUE:
\(+ \Delta t\): from \(0\) to \(T/2\)
\(- \Delta t\): from \(0\) to \(-T/2\)

Stream Options: center = TRUE

ggplot() +
  geom_stream_field(fun = f, n = 3, tail_point = TRUE) +
  geom_stream_field(fun = f, n = 3, tail_point = TRUE, center = FALSE) 

Stream Options: L

p1 <- ggplot() + geom_stream_field(fun = f, n = 4) 
p2 <- ggplot() + geom_stream_field(fun = f, n = 4, L = .3) 
p3 <- ggplot() + geom_stream_field(fun = f, n = 4, L = 3) 

p1 + p2 + p3 + coord_equal() + plot_layout(guides = "collect")

Stream Options: normalize = FALSE

normalize = TRUE:
- Default: Streams grow to length L provided by user or determined algorithmically
normalize = FALSE:
- Default: Streams grow to length L, then all are trimmed to the fastest time T
- Customization: Set T to control stream duration

p1 <- ggplot() + geom_stream_field(fun = f, n = 4, normalize = FALSE) 
p2 <- ggplot() + geom_stream_field(fun = f, n = 4, normalize = FALSE, L = 0.3) 
p3 <- ggplot() + geom_stream_field(fun = f, n = 4, normalize = FALSE, T = 0.3) 
p1 + p2 + p3 + coord_equal() + plot_layout(guides = "collect")

Vector Options: center = FALSE

center = FALSE: An arrow is drawn from \((x,y)\) to \((x,y) + f(x,y)\)

center = TRUE: an arrow is drawn from \((x,y) - \frac{1}{2} f(x,y)\) to \((x,y) + \frac{1}{2} f(x,y)\)

ggplot() +
  geom_vector_field(fun = f, n = 4, tail_point = TRUE, linewidth = 7, color = baylor_green) +
  geom_vector_field(fun = f, n = 4, tail_point = TRUE, center = FALSE, linewidth = 3, color = university_gold) 

Vector Options: L

normalize = TRUE: Vectors are scaled to a fixed length L (either user-specified or computed automatically)

p1 <- ggplot() + geom_vector_field(fun = f, n = 4) 
p2 <- ggplot() + geom_vector_field(fun = f, n = 4, L = .3)
p3 <- ggplot() + geom_vector_field(fun = f, n = 4, L = 3)

p1 + p2 + p3 + coord_equal() + plot_layout(guides = "collect")

Vector Options: normalize = FALSE

normalize = TRUE: Vectors are scaled to a fixed length L (either user-specified or computed automatically)

normalize = FALSE: Vectors retain their natural magnitude—that is, they start at \((x,y)\) and extend to \((x,y) + f(x,y)\) (with \(\Delta T = 1\))

p1 <- ggplot() + geom_vector_field(fun = f, n = 4) 
p2 <- ggplot() + geom_vector_field(fun = f, n = 4, normalize = FALSE)

p1 + p2 + coord_equal() + plot_layout(guides = "collect")

Visualizing \(\mathbb{R}^2 \to \mathbb{R}^2\) is Imperfect

Visualizing \(\mathbb{R}^2 \to \mathbb{R}^2\) is Imperfect

What if we could do the same thing with vectors?

Mapping Vector Norm to Length

ggplot() + 
  geom_vector_field(aes(length = after_stat(norm)), 
                    fun = f, 
                    color = "black",
                    center = FALSE,
                    tail_point = TRUE,
                    arrow = NULL
                    )

Mapping Vector Norm to Length

ggplot() + geom_vector_field2(fun = f)

Mapping Vector Norm to Length

Mapping Vector Norm to Length

ggplot() + geom_vector_field2(fun = f) + scale_length_continuous(max_range = 2)

Gradients as Stream Fields

Consider the scalar function

\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]

Its gradient is a vector field is given by

\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]

f <- function(u) (1/2) * (u[1]^2 + u[2]^2)
p1 <- ggplot() + ggfunction::geom_function_2d_1d(fun = f) ## sneak peak of chapter 3
p2 <- ggplot() + geom_gradient_field(fun = f)
p1 + p2 + coord_equal()

Gradients as Stream Fields

Consider the scalar function

\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]

Its gradient is a vector field is given by

\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]

ggplot() + ggfunction::geom_function_2d_1d(fun = f) + geom_gradient_field(fun = f) + coord_equal()

From Vector Field to Scalar Field

Consider the scalar function

\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]

Its gradient is a vector field is given by

\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]


Express the potential function as a line integral:

\[ \phi(x,y)=\int_{(x_{start},y_{start})}^{(x,y)} \nabla \phi \cdot (dx,dy). \]

Integrate the x-component: \[ \phi(x,y)=\int x\,dx = \frac{1}{2}x^2 + g(y) \]

Differentiate with respect to y: \[ \frac{\partial}{\partial y}\phi(x,y)=\frac{\partial\phi}{\partial y}\left(\frac{1}{2}x^2+g(y)\right) = \frac{\partial\phi}{\partial y}g(y) = \frac{\partial \phi}{\partial y}=g'(y) = y \]

Solve for g(y) by integrating: \[ g(y) = \int g'(y)\,dy = \int y\,dy = \frac{1}{2}y^2 + C \]

So: \[ \phi(x,y) = \frac{1}{2}x^2+g(y) = \frac{1}{2}x^2+\frac{1}{2}y^2+C \]

grad_f <- function(u) c(u[1],u[2])
ggplot() + 
  geom_potential(fun = grad_f) + 
  geom_vector_field(fun = grad_f) +
  coord_equal()

Real World Data

Coulomb’s law gives the force between two charges

\[ \textbf{F} = \frac{q_{1} q_{2}}{\left|\left|\textbf{u}_{1}-\textbf{u}_{2}\right|\right|_{}^{3}}(\textbf{u}_{1}-\textbf{u}_{2}) \]

If multiple charges are present, superposition applies

\[ \textbf{F}(\textbf{u}) = \sum_{i=1}^{2} \frac{q q_{i}}{\left|\left|\textbf{u}-\textbf{u}_{i}\right|\right|_{}^{3}}(\textbf{u}-\textbf{u}_{i}) \]

Example:

\(q_{1} = +1\) at \((1,1)\)
\(q_{2} = -1\) at \((-1,-1)\)
Test charge \(q = +1\) at \((x, y)\)

\[ \textbf{f}(x, y) = \frac{1}{((x-1)^2+(y-1)^2)^{3/2}} \left[\begin{matrix}x-1 \\ y-1 \end{matrix}\right] - \frac{1}{((x+1)^2+(y+1)^2)^{3/2}} \left[\begin{matrix}x+1 \\ y+1 \end{matrix}\right] \]

Real World Data

f <- efield_maker(log = TRUE, charge_positions = rbind(c(1,1), c(3,3)))
xlim <- c(0,4); ylim <- c(0,4)

p1 <- ggplot() + geom_vector_field(fun = f, xlim = xlim, ylim = ylim) 
p2 <- ggplot() + geom_vector_field(fun = f, xlim = xlim, ylim = ylim, normalize = FALSE, center = FALSE) 

(p1 + p2) + plot_layout(guides = "collect")

Random Vector Data

set.seed(123)
n <- 1000
pts <- data.frame(x = runif(n, 0, 4), y = runif(n, 0, 4))
noisy <- t(apply(pts, 1, function(u) f(u) + rnorm(2, 0, 0.05)))
rand_dat <- cbind(pts, setNames(as.data.frame(noisy), c("fx","fy"))) |> as_tibble()

p1 <- ggplot(rand_dat) + geom_vector(aes(x = x, y = y, fx = fx, fy = fy))
p2 <- ggplot(rand_dat) + geom_vector(aes(x = x, y = y, fx = fx, fy = fy), normalize = FALSE) 

p1 + p2 + plot_layout(guides = "collect")

Random Vector Data

General Additive Models

Each component is estimated using a smooth basis expansion:

\[ \widehat{f}_x(x, y) = \sum_{j=1}^{K_x} \beta_j^{(x)} \, \phi_j^{(x)}(x, y), \quad \widehat{f}_y(x, y) = \sum_{j=1}^{K_y} \beta_j^{(y)} \, \phi_j^{(y)}(x, y) \]

\(\phi^{(x)}(x, y)\) and \(\phi^{(y)}(x, y)\) as the basis function vectors at location \((x, y)\) for each model.

\(\beta^{(x)}\) and \(\beta^{(y)}\) are the coefficient vectors estimated using least squares.

If \(\Sigma_x\) and \(\Sigma_y\) be the covariance matrices of \(\beta^{(x)}\) and \(\beta^{(y)}\). Then

\[ \operatorname{Var}[\widehat{\textbf{f}}(x, y)] = \begin{bmatrix} \phi^{(x)}(x, y)^T \Sigma_x \, \phi^{(x)}(x, y) & 0 \\ 0 & \phi^{(y)}(x, y)^T \Sigma_y \, \phi^{(y)}(x, y) \end{bmatrix} \]

General Additive Models

p <- ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector(alpha = .1, color = "black") +
  geom_point(data = eval_point, aes(x = x, y = y), color = "red", size = 3, inherit.aes = FALSE) +
  coord_equal()

p + geom_vector_smooth(eval_points = eval_point, se = FALSE)  

Measuring Uncertainty

The covariance matrix of a predicted vector is defined as:

\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{X}_{\text{new}}\,\mathbf{V}\,\mathbf{X}_{\text{new}}^\top + \boldsymbol{\Sigma}, \quad \mathbf{V} = \sigma^2 (\mathbf{X}^\top \mathbf{X})^{-1}. \]

\(\boldsymbol{\Sigma}_{\widehat{\mathbf{v}}}\) is the covariance matrix of the predicted vector field at a new location.
\(\mathbf{X}_{\text{new}}\) is the design matrix at the new points.
\(\mathbf{V}\) is the covariance matrix of the regression coefficients (\(\mathbf{\beta}\)).
\(\boldsymbol{\Sigma}\) is the observation noise or residual variance.

Eigen-decomposition:

\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^\top, \quad \boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1,\lambda_2). \]

Semi-axis lengths and orientation:

\[ a_i = \sqrt{\lambda_i\,\chi^2_{1-\alpha}}, \quad i=1,2, \qquad \theta = \arctan\left(\frac{q_{2,1}}{q_{1,1}}\right). \]

General Additive Models Measuring Uncertainty

p + geom_vector_smooth(eval_points = eval_point) 

Multivariate Reggsion

\(\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E}\)

where

  • \(\mathbf{Y}\) is the response matrix (with columns \(f_x\) and \(f_y\)),
  • \(\mathbf{X}\) is the design matrix (with columns \(1\), \(x\), \(y\), and \(xy\)),
  • \(\mathbf{B}\) contains the regression coefficients, and
  • \(\mathbf{E}\) is the error matrix.

The least squares estimator is

\(\widehat{\mathbf{B}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{Y}.\)

Multivariate Reggsion

p + geom_vector_smooth(eval_points = eval_point, se = FALSE, method = "lm") 

Multivariate Reggsion Measuring Uncertainty

p + geom_vector_smooth(eval_points = eval_point, method = "lm") 

Cokriging

Cokriging is a technique that extends ordinary kriging to jointly predict multiple, correlated spatial variables. It leverages the spatial cross-correlation between the variables to improve prediction accuracy.

Cokriging

p + geom_vector_smooth(eval_points = eval_point, se = FALSE, method = "kriging") 
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]

Cokriging Measuring Uncertainty

p + geom_vector_smooth(eval_points = eval_point, method = "kriging") 
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]

Predictions across the field

p + geom_vector_smooth(se = FALSE) 

Measuring Uncertainty

\[ r = \sqrt{f_x^2 + f_y^2}, \quad \theta = \operatorname{atan2}(f_y, f_x), \] with Jacobian determinant \(|r|\). The joint density becomes:

\[ f_{r,\theta}(r,\theta) = f_{f_x,f_y}(r\cos\theta,\,r\sin\theta)\,|r|. \] Marginalize over r to get angular density:

\[ f_\theta(\theta) = \int_start^\infty f_{r,\theta}(r,\theta)\,dr, \qquad F_\theta(\theta) = \int_{-\pi}^\theta f_\theta(u)\,du. \] Confidence bounds for angular uncertainty:

\[ F_\theta(\theta_{\text{lower}}) = \frac{\alpha}{2}, \quad F_\theta(\theta_{\text{upper}}) = 1 - \frac{\alpha}{2}. \]

General Additive Models Measuring Uncertainty

ggplot(rand_dat, mapping = aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE) +
  geom_vector_smooth(pi_type = "wedge")

Multivariate Reggsion Measuring Uncertainty

ggplot(rand_dat, mapping = aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE) +
  geom_vector_smooth(pi_type = "wedge", method = "lm")

Cokrieging Measuring Uncertainty

ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE) +
  geom_vector_smooth(pi_type = "wedge", method = "kriging") 
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]

Comparison

p <- ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE) 
  # geom_vector(alpha = .3, color = "black") 
p1 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE) 
p2 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE, method = "lm") 
p3 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE, method = "kriging") 

p1 + p2 + p3 + plot_layout(guides = "collect")
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]

Multivariate Regression as an ODE

What if we saw this regression equation as a \(\mathbb{R}^2 \to \mathbb{R}^2\) vector field?

After all, it takes in x/y data and outputs fx/fy data?

vec_field <- function(u) {
  model <- lm(cbind(fx,fy)~x*y, data = rand_dat)
  newdata <- data.frame(x = u[1], y = u[2])
  as.numeric(predict(model, newdata = newdata))
}

Multivariate Regression as an ODE

vec_field <- function(u) {
  model <- lm(cbind(fx,fy)~x*y, data = rand_dat)
  newdata <- data.frame(x = u[1], y = u[2])
  as.numeric(predict(model, newdata = newdata))
}

ggplot(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) + 
  geom_vector_field(fun = vec_field, xlim = c(0,4), ylim = c(0,4)) 

Multivariate Regression as an ODE

ggplot(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_stream_smooth()

Multivariate Regression as an ODE

p1 <- ggplot(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_stream_smooth()
p2 <- ggplot(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_stream_smooth(type = "stream") 

p1 + p2

Multivariate Regression as an ODE

ggplot(data = rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
  geom_vector_field(fun = f, xlim = c(0,4), ylim = c(0,4), color = "black", center = FALSE) +
  geom_stream_smooth(type = "stream") 

Estimating Gradients from Scalar Data

ggplot(grid) +
  ggfunction::geom_function_2d_1d(aes(x = x, y = y, z = z)) + ## sneak peak
  geom_point(data = sample_data, aes(x = x, y = y, size = z)) 

Estimating Gradients from Scalar Data

scalar_field <- function(u) {
  mod <- lm(z ~ x + y + poly(x,2) + poly(y,2), data = sample_data)
  pred <- predict(mod, newdata = data.frame(x = u[1], y = u[2]))
  as.numeric(pred)
}  

# plot the scalar data with the estimated gradient 
ggplot(grid) +
  ggfunction::geom_function_2d_1d(aes(x = x, y = y, z = z)) + ## sneak peak
  geom_gradient_field(fun = scalar_field) +
  scale_color_viridis_c()

Estimating Gradients from Scalar Data

ggplot(grid) +
  ggfunction::geom_function_2d_1d(aes(x = x, y = y, z = z)) + ## sneak peak
  geom_gradient_smooth(xlim = c(0,pi), ylim = c(-pi/2,pi/2)) + ## need to make this inheret from data
  scale_color_viridis_c()

Estimating Gradients from Scalar Data

ggmap(basemap) +
  geom_sf(data = waco_tx, aes(colour = elevation), inherit.aes = FALSE, size = 0.3) +
  scale_color_viridis_c() +
  ggthemes::theme_map() +
  theme(legend.position = "right", legend.box = "vertical") 

Estimating Gradients from Scalar Data

map + geom_gradient_smooth(
    data = contour_with_elevation, 
    aes(x = X, y = Y, z = elevation),
    formula = z ~ x + y + poly(x,2) * poly(y,2)
  ) 

Something Cool

f <- function(v) {
  x <- v[1]; y <- v[2]
  c(-y + .2 * x, x + .2 * y)
}

ggplot() +
  geom_stream_field(fun = f, xlim = c(-3,3), ylim = c(-3,3),  
                    arrow = NULL, center = FALSE, normalize = FALSE,  
                    L = 10, n = 5, tail_point = TRUE, grid = "hex"
                    ) + theme_void()

Something Cool

library(gganimate)

p <- ggplot() +
  geom_stream_field(fun = f, xlim = c(-3,3), ylim = c(-3,3),  
                    arrow = NULL, center = FALSE, normalize = FALSE,  
                    n = 20, 
                    L = 10, 
                    tail_point = TRUE,
                    grid = "hex"
                    ) + 
  theme_void() +
  transition_reveal(along = after_stat(t))

animate(p, nframes = 30, fps = 10, duration = 10)

A new problem

ggplot2 Layer Expectations

tibble(x = seq(-3,3,.1), y = dnorm(x)) |> 
  ggplot(aes(x = x, y = y)) +
  geom_line()

data <- tibble(x = seq(-3,3,.1), y = dnorm(x))

data |> 
  ggplot(aes(x = x, y = y)) +
  geom_line() +
  geom_area(data = data[data$x < qnorm(.95),])

ggplot2 Layer Expectations

Data

geom_point()
geom_line()
geom_segment()
geom_smooth()
geom_stream()

Functions

geom_function()
geom_stream_field()

Solution - ggfunction


Install from Github

remotes::install_github("dusty-turner/ggfunction")

Load the package

library("ggfunction")

\(\mathbb{R} \to \mathbb{R}\)

ggplot() + geom_function(fun = dnorm, xlim=c(-3, 3))

\(\mathbb{R} \to \mathbb{R}\): Probability Distributions

A PDF \(f_X(x)\) satisfies

\(\Pr(a \le X \le b) = \int_a^b f_X(x) \, dx, \quad \int_{-\infty}^{\infty} f_X(x) \, dx = 1\)

For example, the standard normal PDF is:

\(f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)\)

ggplot() +
  geom_pdf(fun = dnorm, xlim = c(-3, 3), p = 0.95, fill = "grey70")

\(\mathbb{R} \to \mathbb{R}\): Cumulative Distribution Functions

The CDF \(F_X(x)\) is defined as

\[ F_X(x) = \Pr(X \le x) = \int_{-\infty}^{x} f_X(t) \, dt. \]

It is non-decreasing and maps \(\mathbb{R}\) to \([0,1]\).

ggplot() +
  geom_cdf(fun = pnorm, xlim = c(-3, 3), p = 0.975, fill = "grey70")

\(\mathbb{R} \to \mathbb{R}\): Quantile Functions

The quantile function \(Q(p)\) is given by:

\[ Q(p) = \inf\{ x \in \mathbb{R} : F_X(x) \ge p \}, \quad p \in [0,1]. \]

ggplot() +
  geom_qf(fun = qnorm)

\(\mathbb{R} \to \mathbb{R}^2\): Parametric Curves

Parametric curves define a mapping from a scalar parameter \(t\) to 2D coordinates:

\[ x = x(t), \quad y = y(t). \]

For example, the unit circle can be defined as:

\[ x(t) = \sin t,\quad y(t) = \cos t,\quad t \in [0,2\pi]. \]

f <- function(t) c(sin(t), cos(t))
ggplot() + 
  geom_function_1d_2d(fun = f, T = 2*pi)

\(\mathbb{R}^2 \to \mathbb{R}\): Scalar Fields

A scalar field maps 2D input to a scalar output:

\[ f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x,y) = z. \]

The function’s value at each \((x,y)\) can be visualized with color gradients.

Visualize the scalar field \(f(x,y)=\sin(x)\cos(y)\) with:

f <- function(v) sin(v[1]) * cos(v[2])
ggplot() +
  geom_function_2d_1d(fun = f, xlim = c(-3*pi,3*pi), ylim = c(-3*pi,3*pi), n = 100)

\(\mathbb{R}^2 \to \mathbb{R}^2\): Vector Fields

A wrapper for geom_stream_field()

f <- function(v) c(-v[2], v[1])
ggplot() +
  geom_function_2d_2d(fun = f, xlim = c(-3, 3), ylim = c(-3, 3))

In Summary

Completed

geom_stream()
geom_vector()

geom_stream_field() geom_vector_field() geom_gradient_field()

geom_potential()

geom_vector_smooth() geom_stream_smooth() geom_gradient_smooth()

Future

Publish Papers
Publicize Package
geom_complex_function()
geom_stream_field(se = TRUE)

Thank You!

Committee
- Dr David Kahle
- COL (R) Rod Sturdivant, PhD
- Dr Keith Richards
- Dr Joshua Patrick

Peers

West Point

Family


github.com/dusty-turner/ggvfields